Multi-relational data mining in Microsoft SQL
Abstract
Most real-life data are relational by nature, so integrating data mining with database systems is an essential goal. Microsoft SQL Server (MSSQL) seems to provide an interesting and promising environment for developing aggregated multi-relational data mining algorithms, through nested tables and the plug-in algorithm approach. However, it is currently unclear how data mining algorithms can best use these nested tables. In this paper we look at how Microsoft Decision Trees (MSDT) handles multi-relational data, and we compare it with the multi-relational decision tree learner TILDE. In our experiments, MSDT is as accurate as TILDE in prediction, but the trees it produces either ignore the relational information or use it in a way that yields noninterpretable trees. Its explanatory power is therefore reduced compared to that of a multi-relational decision tree learner. We conclude that it may be worthwhile to integrate a multi-relational decision tree learner in MSSQL.
Similar resources
Integration of Data Mining with Database Technology
In this paper, we review the past work and discuss the future of integration of data mining and relational database systems. We also discuss support for integration in Microsoft SQL Server 2000.
Efficient Evaluation of Queries with Mining Predicates
Modern relational database systems are beginning to support ad hoc queries on mining models. In this paper, we explore novel techniques for optimizing queries that apply mining models to relational data. For such queries, we use the internal structure of the mining model to automatically derive traditional database predicates. We present algorithms for deriving such predicates for some popular ...
Mining Generalized Association Rules and Sequential Patterns Using SQL Queries
Database integration of mining is becoming increasingly important with the installation of larger and larger data warehouses built around relational database technology. Most of the commercially available mining systems integrate loosely (typically, through an ODBC or SQL cursor interface) with data stored in DBMSs. In cases where the mining algorithm makes multiple passes over the data, it is...
SQL Based Association Rule Mining Using Commercial RDBMS (IBM DB2 UDB EEE)
Data mining is becoming increasingly important as databases grow ever larger and the need to discover hidden rules in them becomes widely recognized. Currently, database systems are dominated by relational databases, and the ability to perform data mining using standard SQL queries will definitely ease the implementation of data mining. However the performance of SQL based da...